N=1:

rank | frequency | n-gram |
---|---|---|
1 | 14803 | -a |
2 | 12843 | -e |
3 | 12229 | -o |
4 | 9988 | -i |
5 | 4775 | -n |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 2599 | -on |
2 | 2277 | -to |
3 | 2124 | -te |
4 | 1919 | -ia |
5 | 1851 | -re |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 2079 | -ion |
2 | 1341 | -nte |
3 | 718 | -nto |
4 | 715 | -ica |
5 | 690 | -are |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 1275 | -sion |
2 | 854 | -ente |
3 | 517 | -ento |
4 | 318 | -ando |
5 | 314 | -enti |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 689 | -asion |
2 | 564 | -mente |
3 | 395 | -mento |
4 | 170 | -sioni |
5 | 167 | -ssion |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos:=(@pos+1) AS pos, xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
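
The same counting can be sketched outside the database for any N. The following is a minimal Python illustration, not the code used to produce the tables: the function name `top_word_endings` and the toy word list are hypothetical. It skips words shorter than N letters and does not reproduce the `w_id>100` filter of the query.

```python
from collections import Counter


def top_word_endings(words, n, limit=5):
    """Return (rank, frequency, n-gram) rows for the most frequent n-letter word endings."""
    # Count the last n letters of every word that has at least n letters
    # (assumption: shorter words are ignored rather than counted whole).
    counts = Counter(word[-n:] for word in words if len(word) >= n)
    # most_common() orders by descending frequency, like ORDER BY cnt DESC.
    return [
        (rank, freq, "-" + ending)
        for rank, (ending, freq) in enumerate(counts.most_common(limit), start=1)
    ]


# Hypothetical toy word list, standing in for the `words` table.
sample_words = ["mente", "lente", "gente", "canto", "santo", "a"]

for row in top_word_endings(sample_words, 3):
    print(row)
```

Calling the function with `n` from 1 to 5 would give one table per N-gram length, in the same column order as the tables above.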
2.2.5 Most frequent word beginnings